35 results found.
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Amharic Bengali Cantonese Georgian Javanese Lao Vietnamese Zulu
Availability:
From Data Center(s)
License:
LDC
Size:
80 GByte
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Zero-shot Cross-Lingual Phonetic Recognition with External Language Embedding
- Paper track: 8.11 Cross-lingual and multilingual/accent aspects/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Heting Gao | IARPA Babel Language Pack | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Amharic Javanese Kazakh Tagalog Turkish Vietnamese
Availability:
License:
Size:
300 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Differentiable Allophone Graphs for Language Universal Speech Recognition
- Paper track: 9.8 Cross-lingual and multilingual components for /Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Brian Yan | IARPA Babel Language Pack | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Cantonese Indonesian Japanese Kazakh Korean Mandarin Russian Tibetan Uyghur Vietnamese
Availability:
From Owner
License:
Speechocean and Center for Speech and Language Technologies (Tsinghua University)
Size:
None GByte
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Language recognition on unknown conditions: the LORIA-Inria-MULTISPEECH system for AP20-OLR Challenge
- Paper track: 14.4 Oriental Language Recognition/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | Oriental Language Recognition challenge 2020 corpus | /N |
Documentation:
Evaluation plan paper
Speech
Corpus,
Language Type:
Multilingual
Languages:
Arabic English Farsi French German Hindi Japanese Korean Mandarin Russian Spanish Tamil Vietnamese
Availability:
From Owner
License:
LDC
Size:
46 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2003 NIST Language Recognition Evaluation | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Arabic Bengali Dari English German Hindi Iranian Persian Japanese Korean Mandarin Chinese Persian Russian Spanish Standard Arabic Tamil Thai Vietnamese Yue Chinese
Availability:
From Owner
License:
LDC
Size:
66 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2007 NIST Language Recognition Evaluation Test Set | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Amharic Bosnian Croatian Dari English French Georgian Haitian Hausa Hindi Korean Mandarin Chinese Persian Portuguese Pushto Russian Spanish Turkish Ukrainian Urdu Vietnamese Yue Chinese
Availability:
From Owner
License:
LDC
Size:
215 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2009 NIST Language Recognition Evaluation Test Set | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Cantonese English French German Gishu Greek Gujarati Hebrew Hindi Indonesian Japanese Korean Mandarin Persian Portuguese Runyankore Russian Spanish Turkish Vietnamese
Availability:
Freely Available
License:
OpenSource
Size:
22.8 GByte
Production Status:
Newly created-in progress
Use:
Speech Recognition/Understanding
- Paper title: Speaking rate, information density, and information rate in first-language and second-language speech
- Paper track: 1.10 Bilingual and L2 acquisition and processing/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ann Bradlow | The ALLSSTAR Corpus | /N |
Documentation:
Documentation in English is available to the public (via the project website)
Written
Corpus,
Language Type:
Monolingual
Languages:
Vietnamese
Availability:
From Owner
License:
Size:
5,922 sentences
Production Status:
Newly created-finished
Use:
Question Answering
- Paper title: Answering Legal Questions by Learning Neural Attentive Text Representation
- Paper track: Long paper/
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ngo Xuan Bach | Vietnamese Legal QA | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English Indonesian Japanese Mandarin Chinese Vietnamese
Availability:
Freely Available
License:
CC BY
Size:
7093
Production Status:
Existing-updated
Use:
Word Sense Disambiguation
- Paper title: Identifying Idioms in Chinese Translations
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Wan Yu Ho | Nanyang Technological University | SG |
| Author 2 | Christine Kng | Saint John's College, Santa Fe | US |
| Author 3 | Shan Wang | Nanyang Technological University | MO |
| Author 4 | Francis Bond | Nanyang Technological University | SG |
| Main Contact | Francis Bond | Nanyang Technological University | None |
Documentation:
some documentation in English
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Hausa Korean Mandarin Chinese Thai Vietnamese
Availability:
From Data Center(s)
License:
ELRA, Appen
Size:
60 GByte
Production Status:
Existing-updated
Use:
Multilingual Speech Processing incl. multilingual speech recognition, rapid deployment of speech processing systems, language identification,
- Paper title: GlobalPhone: Pronunciation Dictionaries in 20 Languages
- Paper track: Speech
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Tanja Schultz | Karlsruhe Institute of Technology | DE |
| Author 2 | Tim Schlippe | Karlsruhe Institute of Technology | DE |
| Main Contact | Tanja Schultz | Universität Bremen | None |
Documentation:
yes; English; in part in various publications from our group




